Extracting causal graphs from an open provenance data model

Authors

  • Simon Miles
  • Paul T. Groth
  • Steve Munroe
  • Sheng Jiang
  • Thibaut Assandri
  • Luc Moreau
Abstract

The Open Provenance Architecture (OPA) approach to the challenge was distinct in several regards. In particular, it allows different components of the challenge workflow to record documentation independently, and allows the workflow to be executed in different environments; both are made possible by an open, well-defined data model and architecture. Another notable feature is that we distinguish between process documentation, which is the data recorded about what has occurred, and the provenance of a data item, which is everything that caused the data item to be as it is. In this view, provenance is obtained as the result of a query over process documentation. This distinction allows us to tailor the system to best address the separate requirements of recording and querying documentation. Other notable features include the explicit recording of causal relationships between both events and data items, an interaction-based world model, intensional definition of data items in queries rather than reliance on explicit naming mechanisms, and styling of documentation to support non-functional application requirements such as reducing storage costs or ensuring the privacy of data. In this paper, we describe how each of these features aids us in answering the challenge’s provenance queries.
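
The idea that provenance is the result of a query over process documentation can be illustrated with a minimal sketch. It is not the OPA data model or query interface: the record format (pairs of effect and cause), the item names, and the function `provenance` are all hypothetical, chosen only to show causal relationships being recorded first and traversed later.

```python
from collections import deque

# Hypothetical process documentation: each record says that an effect
# (an event or data item) was directly caused by another item.
# Records are simply appended as things happen; nothing is computed here.
process_documentation = [
    ("report", "summary_table"),
    ("summary_table", "cleaned_data"),
    ("cleaned_data", "raw_data"),
    ("cleaned_data", "cleaning_rules"),
    ("other_report", "other_data"),
]


def provenance(item, documentation):
    """Return every item that directly or indirectly caused `item`.

    This is the "query" step: a breadth-first walk backwards over the
    recorded causal relationships, starting from the item of interest.
    """
    # Index the records by effect so each lookup is a dictionary access.
    causes_of = {}
    for effect, cause in documentation:
        causes_of.setdefault(effect, set()).add(cause)

    found = set()
    queue = deque([item])
    while queue:
        current = queue.popleft()
        for cause in causes_of.get(current, ()):
            if cause not in found:
                found.add(cause)
                queue.append(cause)
    return found


if __name__ == "__main__":
    print(sorted(provenance("report", process_documentation)))
```

Keeping the recording step (appending records) separate from the query step (the traversal) mirrors the distinction drawn in the abstract: unrelated records such as `other_report` stay in the documentation but never appear in the provenance of `report`.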

Similar resources

Janus: From Workflows to Semantic Provenance and Linked Open Data

Data provenance graphs are a form of metadata that can be used to establish a variety of properties of data products that undergo sequences of transformations, typically specified as workflows. Their usefulness for answering user provenance queries is limited, however, unless the graphs are enhanced with domain-specific annotations. In this paper we propose a model and architecture for semantic, ...

A Formal Account of the Open Provenance Model

On the Web, where resources such as documents and data are published, shared, transformed, and republished, provenance is a crucial piece of metadata that would allow users to place their trust in the resources they access. The Open Provenance Model (OPM) is a community data model for provenance that is designed to facilitate the meaningful interchange of provenance information between systems....

Provenance for Data Mining

Data mining aims at extracting useful information from large datasets. Most data mining approaches reduce the input data to produce a smaller output summarizing the mining result. While the purpose of data mining (extracting information) necessitates this reduction in size, the loss of information it entails can be problematic. Specifically, the results of data mining may be more confusing than...

Approaches for Exploring and Querying Scientific Workflow Provenance Graphs

While many scientific workflow systems track and record data provenance, few tools have been developed that provide convenient and effective ways to access and explore this information. Two important ways for provenance information to be accessed and explored are through browsing (i.e., visualizing and navigating data and process dependencies) and querying (e.g., to select certain portions of pr...

Temporal Provenance Model (TPM): Model and Query Language

Provenance refers to the documentation of an object’s lifecycle. This documentation (often represented as a graph) should include all the information necessary to reproduce a certain piece of data or the process that led to it. In a dynamic world, as data changes, it is important to be able to get a piece of data as it was, and its provenance graph, at a certain point in time. Supporting time-a...

Journal title:
  • Concurrency and Computation: Practice and Experience

Volume 20

Publication date: 2008